feat(remote-config)!: agentless RC fetcher#2112
Conversation
|
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
| // TODO: We do not store the delegated targets metadata | ||
| // This will need to be revisited in order to support proper Uptane | ||
| // verification of the full configuration data. | ||
| // store(repo, &targets_path, &metas.delegated_targets).await?; |
There was a problem hiding this comment.
I think you can safely skip them, director repo already confirms it's from datadog, for your account, integrity checks and the routing predicate. What it does not confirm is the product behind the config
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f71677404b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .uri(url_with_path(self.endpoint.url.clone(), path)?) | ||
| .method(method) | ||
| .body(body)?; | ||
| Ok(self.http.request(req).await?) |
There was a problem hiding this comment.
Honor endpoint timeouts for agentless requests
When agentless mode is pointed at a slow or unresponsive RC backend, this await is not bounded by Endpoint::timeout_ms. The agent path wraps its HTTP request in tokio::time::timeout(...), and NativeHttpClient only awaits hyper and body collection, so a stalled request can keep fetch_once and the SharedFetcher run loop stuck indefinitely instead of allowing cancellation or later polls. Wrap this request with the endpoint timeout before awaiting it.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f71677404b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .uri(url_with_path(self.endpoint.url.clone(), path)?) | ||
| .method(method) | ||
| .body(body)?; | ||
| Ok(self.http.request(req).await?) |
There was a problem hiding this comment.
Honor endpoint timeouts for agentless requests
When agentless mode is pointed at a slow or unresponsive RC backend, this await is not bounded by Endpoint::timeout_ms. The agent path wraps its HTTP request in tokio::time::timeout(...), and NativeHttpClient only awaits hyper and body collection, so a stalled request can keep fetch_once and the SharedFetcher run loop stuck indefinitely instead of allowing cancellation or later polls. Wrap this request with the endpoint timeout before awaiting it.
Useful? React with 👍 / 👎.
| expire_unused_files: bool, | ||
| } | ||
|
|
||
| impl<'a, S: FileStorage> TargetCache<'a, S> { |
There was a problem hiding this comment.
It's okay, though this code shares quite a bit of common logic with the agent fetcher. E.g. would be trivial to collect/construct NewTarget in fetcher.rs and call store_batch.
I suppose that was done to move faster and avoid untangling the agent fetcher.
| target.path | ||
| ) | ||
| })?; | ||
| if stored.hash != target.primary_hash || stored.meta.length as u64 != target.length |
There was a problem hiding this comment.
What's the goal of re-verifying here? is_cached_batch() omits those entries which are having a different hash/len and fetch_target() handles verification of new ones?
(It's also the meaningful difference to the agentful fetcher path, hence I ask.)
There was a problem hiding this comment.
Hmm I just added the case to be a bit more defensive.
I agree it's not necessary though.
Motivation
In some environments (serverless, Datadog studio), tracers can't rely on a Datadog Agent to proxy Remote Config requests. This PR adds an agentless mode to libdd-remote-config so the fetcher can talk directly to the RC backend, mirroring what the Go agent does today.
The protocol is different from the agent's /v0.7/config: the backend speaks protobuf and ships raw TUF metadata + target files that the client must verify locally. This means embedding TUF trust roots per site and running a full Uptane-style validation on every poll.
Changes
How to test the change?
Also this has been tested in dd-trace-rs
DataDog/dd-trace-rs#263